17 research outputs found

    Uncertainty-Aware Principal Component Analysis

    We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the PCA sample covariance matrix that respects potential uncertainty in each of the inputs, building the mathematical foundation of our new method: uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables a better understanding of the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets. As a special case, we show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.
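    The closed-form special case mentioned in the abstract rests on a standard property: a Gaussian pushed through a linear map is again Gaussian. The sketch below illustrates that property for a PCA projection; the variable names and toy data are our own illustration, not the paper's method or notation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 correlated samples in 3D.
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.3, 0.1],
                                          [0.0, 1.0, 0.2],
                                          [0.0, 0.0, 0.5]])
mean = X.mean(axis=0)

# Standard PCA via eigendecomposition of the sample covariance.
C = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)      # ascending eigenvalues
W = eigvecs[:, ::-1][:, :2]               # top-2 principal directions

# An uncertain input modeled as a multivariate normal N(mu, Sigma).
mu = np.array([1.0, -0.5, 0.2])
Sigma = np.diag([0.1, 0.05, 0.2])

# Linear maps send Gaussians to Gaussians, so the projected input is
# again normal, with mean and covariance obtained in closed form:
mu_proj = W.T @ (mu - mean)
Sigma_proj = W.T @ Sigma @ W
```

    Sampling-based propagation would instead draw many points from N(mu, Sigma) and project each one; the closed form avoids that cost entirely, which is one source of the performance gain the abstract refers to.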

    Human Reasoning in Visualization and Visual Analytics

    No full text
    The visualization of information is an astonishing cultural technique. It is employed in several scientific disciplines dealing with human reasoning, decision-making, and logical as well as statistical inference. Thus, visualization can be regarded as one of the most versatile forms of representation, and certainly one of the most ubiquitous ones. With the growing availability of digital computation power, visualization also became an interface between humans and machines, such as in visual analytics systems. Despite a long history of research from diverse angles and its wide application in practice, a unified broad theoretic foundation of why and how visualization facilitates human decision-making is missing. In this thesis, we contribute to the theoretic foundation of visualization, and apply some of our concepts to concrete examples. First, we construct a network of arguments that connects a wide range of theoretic arguments on why visualization works. Beyond collecting more than one hundred arguments, the network explicates dependencies between these arguments, as well as the need for trade-offs between opposing arguments. With the network, we identify the hypothetical compromise between the specificity and the flexibility of visualizations. We conduct a first experiment on the uninstructed transfer between two probabilistic inference tasks to investigate this expected trade-off empirically. Our experiment provides results that are only partially in line with our expectations. Furthermore, we introduce a representational framework that disentangles the visualization processes as experienced by designers and viewers. In particular, we describe how viewers benefit from an intricate division of labor with designers. Optimally, designers provide transparent visualizations that fit the tasks at hand.
In a second experiment, we detail particular predictions of our representational framework on the tailoring of visualizations to Bayesian inference tasks, which are practically relevant in medical diagnosis and the evaluation of binary classification models. We find evidence that tailored visual representations can boost performance. However, additional experiments need to be undertaken to arrive at a conclusive understanding of how representations alter inference processes. Methodologically, we highlight the importance of choosing appropriate performance measures for evaluating participants’ performance in tasks that promote highly structured error patterns. The commonly used mean absolute error measure can mislead in such scenarios, especially when comparing performance across tasks. Crucially, our representational framework extends beyond the presentation of a priori known answers to closed tasks, such as the positive predictive value in the case of Bayesian inference tasks. When dealing with open-ended inference and decision-making tasks, there are no such well-defined solutions. Instead, tasks are typically ill-posed and potential solutions are notoriously uncertain. As a result, viewers increasingly become designers by following unforeseen reasoning paths and by navigating through interactive visualizations and visual analytics systems. We present a conceptual workflow for joint model development as a theoretic basis of future visual analytics systems, and demonstrate its feasibility in the context of interactive regression analysis. Throughout the modeling process, human reasoning is crucial, for example, in the evaluation and comparison of candidate models along multiple objectives. We stress that reducing human involvement through the premature quantification and resolution of expected trade-offs risks limiting real-world performance and conflicts with the data-driven stance of machine learning.
Applying our concepts to other contexts involving more open-ended tasks constitutes a major avenue for future research on visualization and visual analytics. Our network of arguments and our representational framework provide a broad foundation for empirical investigations of visualization. In the long run, decision-makers may be able to reason more effectively by utilizing visualizations built on the joint expertise of interdisciplinary research, which we facilitated by exploring its theoretic foundation.

    Design Considerations on Glyph Placement Strategies

    No full text
    While glyph design is well researched and several placement techniques have been suggested, how to place glyphs in practice is not straightforward. Based on the literature, we structure the problem space of glyph placement into three main categories: context-driven placement, placement of data collections, and placement of data samples. Following this categorization, we discuss several design considerations. Additionally, we highlight dependencies on task, user, and data that prohibit the formulation of generally applicable guidelines.

    Visual Analytics Framework for the Assessment of Temporal Hypergraph Prediction Models

    No full text
    Members of communities often share topics of interest. However, usually not all members are interested in all topics, and participation in topics changes over time. Prediction models based on temporal hypergraphs that—in contrast to state-of-the-art models—exploit group structures in the communication network can be used to anticipate changes of interest. In practice, there is a need to assess these models in detail. While loss functions used in the training process can provide initial cues on the model’s global quality, local quality can be investigated with visual analytics. In this paper, we present a visual analytics framework for the assessment of temporal hypergraph prediction models. We introduce its core components: a sliding window approach to prediction and an interactive visualization for partially fuzzy temporal hypergraphs.
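    A sliding window approach to temporal prediction can be sketched generically: a training window of fixed length is followed by a prediction window, and both advance along the time axis. The function below is our own minimal illustration of this scheme, not the framework's implementation; the parameter names are assumptions.

```python
from typing import Iterator, Sequence, Tuple

def sliding_windows(timestamps: Sequence[int], train_len: int,
                    predict_len: int, step: int) -> Iterator[Tuple[range, range]]:
    """Yield (training range, prediction range) pairs over a time axis.

    The model would be fit on interactions inside the training range and
    assessed against those inside the prediction range; each iteration
    shifts both windows forward by `step` time units.
    """
    t = min(timestamps)
    end = max(timestamps)
    while t + train_len + predict_len <= end + 1:
        yield (range(t, t + train_len),
               range(t + train_len, t + train_len + predict_len))
        t += step
```

    For example, over time steps 0–9 with a training length of 4, a prediction length of 2, and a step of 2, this yields three assessment windows, the first training on steps 0–3 and predicting steps 4–5.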

    The Biases of Thinking Fast and Thinking Slow

    No full text
    Visualization is a human-centric process, which is inevitably associated with potential biases in human judgment and decision making. While discussions of human biases have been heavily influenced by the work of Daniel Kahneman, as summarized in his book “Thinking, Fast and Slow” [8], there have also been viewpoints in psychology in favor of heuristics (e.g., [6]). In this paper, we present a balanced discourse on human heuristics and biases as two sides of the same coin. In particular, we examine these two aspects from a probabilistic perspective, and relate them to the notions of global and local sampling. We use three case studies from Kahneman’s book to illustrate the potential biases of human- and machine-centric decision processes. Our discourse leads to a concrete conclusion that visual analytics, where interactive visualization is integrated with statistics and algorithms, offers an effective and efficient means to overcome biases in data intelligence.

    ODIX : A Rapid Hypotheses Testing System for Origin-Destination Data

    No full text
    In this paper, we present our solution to the VAST Challenge 2017 Mini Challenge 1. We discuss the challenges posed by the data set and tasks, and introduce ODIX, a custom rapid hypotheses testing system tailored to the origin-destination data provided by the challenge. We show findings made with ODIX and illustrate how we apply sequential pattern mining to explore common traffic patterns.

    Using visual analytics to provide situation awareness for movement and communication data

    No full text
    Analyzing and correlating movement and communication data is challenging. To gather insights and gain knowledge from such datasets, we propose a visual analytics system. We apply automated clustering techniques and propose a combination of various visualizations to provide overviews. We support the analyst in exploring the data to eventually enhance situational awareness in complex analysis scenarios. To evaluate our approach, we apply our techniques in the context of the VAST 2015 Grand Challenge (GC). Within this challenge, we successfully identify suspicious patterns and interesting groups with distinctive behavior among visitors of an amusement park, and correlate them with their respective communication patterns to gain insights.

    Why Visualize? : Untangling a Large Network of Arguments

    No full text
    Visualization has been deemed a useful technique by researchers and practitioners alike, leaving a trail of arguments behind that reason why visualization works. In addition, examples of misleading uses of visualization in information communication have occasionally been pointed out. Thus, to contribute to the fundamental understanding of our discipline, we require a comprehensive collection of arguments on "why visualize?" (or "why not?"), untangling the rationale behind positive and negative viewpoints. In this paper, we report a theoretical study to understand the underlying reasons of various arguments, their relationships (e.g., built-on and conflict), and their respective dependencies on tasks, users, and data. We curated an argumentation network based on a collection of arguments from various fields, including information visualization, cognitive science, psychology, statistics, philosophy, and others. Our work proposes several categorizations for the arguments and makes their relations explicit. We contribute the first comprehensive and systematic theoretical study of the arguments on visualization. Thereby, we provide a roadmap towards building a foundation for visualization theory and empirical research, as well as for practical application in the critique and design of visualizations. In addition, we provide our argumentation network and argument collection online at https://whyvis.dbvis.de, supported by an interactive visualization.

    Perspectives on the 2×2 Matrix : Solving Semantically Distinct Problems Based on a Shared Structure of Binary Contingencies

    No full text
    Cognition is both empowered and limited by representations. The matrix lens model explicates tasks that are based on frequency counts, conditional probabilities, and binary contingencies in a general fashion. Based on a structural analysis of such tasks, the model links several problems and semantic domains and provides a new perspective on representational accounts of cognition that recognizes representational isomorphs as opportunities, rather than as problems. The shared structural construct of a 2×2 matrix supports a set of generic tasks and semantic mappings that provide a unifying framework for understanding problems and defining scientific measures. Our model's key explanatory mechanism is the adoption of particular perspectives on a 2×2 matrix that categorizes the frequency counts of cases by some condition, treatment, risk, or outcome factor. By the selective steps of filtering, framing, and focusing on specific aspects, the measures used in various semantic domains negotiate distinct trade-offs between abstraction and specialization. As a consequence, the transparent communication of such measures must explicate the perspectives encapsulated in their derivation. To demonstrate the explanatory scope of our model, we use it to clarify theoretical debates on biases and facilitation effects in Bayesian reasoning and to integrate the scientific measures from various semantic domains within a unifying framework. A better understanding of problem structures, representational transparency, and the role of perspectives in the scientific process yields both theoretical insights and practical applications.
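    The "filtering, framing, and focusing" idea can be made concrete with the standard diagnostic measures, each of which conditions on one row or column of a 2×2 matrix of frequency counts. The function below is our own illustration of this shared structure, not the paper's model; the measure set and naming are assumptions.

```python
def binary_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Common measures as perspectives on a 2x2 matrix of frequency counts.

    Each measure filters the matrix to one row or column (the adopted
    perspective) and focuses on a single cell within it.
    """
    return {
        # Column perspective: condition on the true class.
        "sensitivity": tp / (tp + fn),   # P(test+ | condition+)
        "specificity": tn / (tn + fp),   # P(test- | condition-)
        # Row perspective: condition on the test outcome.
        "ppv": tp / (tp + fp),           # P(condition+ | test+)
        "npv": tn / (tn + fn),           # P(condition- | test-)
        # Whole-matrix perspective.
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }
```

    With illustrative counts per 1,000 cases (tp=8, fp=95, fn=2, tn=895; chosen to mimic a low-prevalence screening scenario, not taken from the paper), sensitivity is 0.8 while the positive predictive value is only about 0.078. Confusing these two perspectives on the same matrix is precisely the kind of Bayesian reasoning error the model addresses.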